A machine model for data stream computation

Author

  • Sumit Ganguly
Abstract

We consider online, space-bounded computations in the data stream processing model, where a stream is a sequence of records of the form (i, 1), signifying insertion of item i, or (i, −1), signifying deletion of item i, where i ∈ {1, 2, …, n}. This model finds applications in network monitoring; approximate query answering in databases; computational geometry, where items signify points in d-dimensional space; graph streams, where items signify edges of a streaming graph; and compressed sensing. We abstract computations over data streams using the stream automata model of computation. Our main result is that a certain natural generalization of the transition function of any stream automaton is essentially a linear mapping. This is used to derive space lower bounds for deterministic data stream computations.
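The update model described in the abstract can be illustrated with a minimal sketch. This is not the paper's stream automaton construction; it only shows, under stated assumptions, what a stream of (i, ±1) records does to a frequency vector and what it means for a summary to depend linearly on that vector. The function names `frequency_vector` and `linear_sketch` and the example matrix are hypothetical, chosen for illustration.

```python
from typing import List, Sequence, Tuple

# One stream record: (item i in 1..n, delta in {+1, -1}).
Update = Tuple[int, int]


def frequency_vector(stream: Sequence[Update], n: int) -> List[int]:
    """Apply insertions (+1) and deletions (-1) to an n-dimensional vector."""
    f = [0] * n
    for i, delta in stream:
        if not 1 <= i <= n:
            raise ValueError(f"item {i} outside 1..{n}")
        if delta not in (1, -1):
            raise ValueError(f"delta must be +1 or -1, got {delta}")
        f[i - 1] += delta
    return f


def linear_sketch(stream: Sequence[Update], matrix: Sequence[Sequence[int]]) -> List[int]:
    """Maintain state = matrix * f online, one record at a time.

    Assumption for illustration: the summary is a fixed k x n integer
    matrix applied to the frequency vector. Each record (i, delta) adds
    delta times column i of the matrix to the state.
    """
    state = [0] * len(matrix)
    for i, delta in stream:
        for r, row in enumerate(matrix):
            state[r] += delta * row[i - 1]
    return state


if __name__ == "__main__":
    stream = [(1, 1), (3, 1), (1, -1), (2, 1), (3, 1)]
    matrix = [[1, 1, 1], [1, 2, 3]]  # hypothetical 2 x 3 summary
    print(frequency_vector(stream, 3))
    print(linear_sketch(stream, matrix))
```

Because the state is a linear function of the frequency vector, the state reached after two concatenated streams is the sum of the states reached on each stream separately, and it does not depend on the order of the records.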

Similar resources

Three New Systematic Approaches for Computing Heffron-Phillips Multi-Machine Model Coefficients (RESEARCH NOTE)

This paper presents three new systematic approaches for computing the coefficient matrices (K1, …, K6) of the Heffron-Phillips multi-machine model. The computation required by the conventional approach and by the three new approaches is compared by counting the number of multiplications and divisions. The advantages of the new approaches are: (1) their computational burdens are less than 73 percent of that of convent...

Stream-based statistical machine translation

We investigate a new approach for SMT system training within the streaming model of computation. We develop and test incrementally retrainable models which, given an incoming stream of new data, can efficiently incorporate the stream data online. A naive approach using a stream would use an unbounded amount of space. Instead, our online SMT system can incorporate information from unbounded inco...

Comet: Batched Stream Processing in Data Intensive Distributed Computing

Performance and resource optimization is an important research problem in data intensive distributed computing. We present a new batched stream processing model that captures query correlations to expose I/O and computation redundancies for optimizations. The model is inspired by our empirical study on a trace from a production large-scale data processing cluster, which reveals significant redu...

Communicating Stream X-Machines Systems are no more than X-Machines

A version of the communicating stream X-machine model is proposed, which gives a precise representation of the operation of transferring data from one X-machine to another. For this model it is shown that systems of communicating X-machines have the same computational power as single stream X-machines. This enables existing methods for deriving test strategies for stream X-machines to be extended...

Evaluation of a New Incremental Classification Tree Algorithm for Mining High Speed Data Streams

A new model for online machine learning on high-speed data streams is proposed, to ease the severe restrictions associated with existing machine learning algorithms. Most of the existing models have three principal steps. In the first step, the system would create a model incrementally. In the second step the time taken by the examples to complete a prescribed procedure...

Publication date: 2007